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Abstract 

We consider molecular communication, with information conveyed in the time of release of molecules. 
The main contribution of this paper is the development of a theoretical foundation for such a communica- 
tion system. Specifically, we develop the additive inverse Gaussian (IG) noise channel model: a channel 
in which the information is corrupted by noise with an inverse Gaussian distribution. We show that 
such a channel model is appropriate for molecular communication in fluid media - when propagation 
between transmitter and receiver is governed by Brownian motion and when there is positive drift 
from transmitter to receiver. Taking advantage of the available literature on the IG distribution, upper 
and lower bounds on channel capacity are developed, and a maximum likelihood receiver is derived. 
Theory and simulation results are presented which show that such a channel does not have a single 
quality measure analogous to signal-to-noise ratio in the AWGN channel. It is also shown that the 
use of multiple molecules leads to reduced error rate in a manner akin to diversity order in wireless 
communications. Finally, we discuss some open problems in molecular communications that arise from 
the IG system model. 



I. Introduction 

Modem communication systems are almost exclusively based on the propagation of electro- 
magnetic (or acoustic) waves. Of growing recent interest, nanoscale networks, or nanonetworks, 
are systems of communicating devices, where both the devices themselves and the gaps between 
them are measured in nanometers [1]. Due to the limitations on the available size, energy, and 
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processing power, it is difficult for them to communicate through conventional means such 
as electromagnetic or acoustic waves. Thus, communication between nanoscale devices will 
substantially differ from the well known wired/wireless communication scenarios. 

In this paper, we address communication in a nanonetwork operating in a aqueous environment; 
more precisely, we consider communication between two nanomachines connected through a fluid 
medium, where messages are encoded in patterns of molecules. In this scheme, the transmitter 
sends information to the receiver by releasing molecules into the fluid medium connecting 
them; the molecules propagate through the fluid medium; and the receiver, upon receiving 
the molecules, decodes the information by processing or reacting with the molecules. This 
method, known as molecular communication [2], is inspired by biological micro-organisms which 
exchange information through molecules. Information can be encoded on to the molecules in 
different ways, such as using timing, concentration, or the identities of the molecules themselves. 

Molecular communication has recently become a rapidly growing discipline within commu- 
nications and information theory. The existing literature that can be divided into two broad 
categories: in the first category, components and designs to implement molecular communication 
systems are described; for example, communications based on calcium ion exchange [3] and 
liposomes [4] have been proposed. These are commonly used by living cells to communicate. 
Other work (e.g., [5], [6]) has explored the use of molecular motors to actively transport 
information-bearing molecules. To date, a considerable amount of work has been done in related 
directions, much of which is beyond the scope of this paper; a good review is found in [7]. 

In the second category, channel models are analyzed and information-theoretic capacity ob- 
tained, largely via simulations. Our own prior work falls in this category: in [8], idealized models 
and mutual information bounds were presented for a Wiener process model of Brownian motion 
without drift; while in [9], [10], a net positive drift was added to the Brownian motion and 
mutual information between transmitter and receiver calculated using simulations. Aside from 
our own work, mutual information has been calculated for simplified transmission models (e.g., 
on-off keying) in [11], [12]; while communication channel models for molecular concentration 
have been presented in [13], and mutual information calculated in [14]. Less closely related to 
the current paper, information-theoretic work has also been done to evaluate multiuser molecular 
communication channels [15], and evaluate the capacity of calcium relay channels [16]. Related 
work also includes information-theoretic literature on the trapdoor channel [17], [18], and the 
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queue-timing channel [19], [20]. 

Building on the work in [10], in this paper, we consider a molecular timing channel in 
the presence of Brownian motion with positive drift. Brownian motion is physically realistic 
for nanodevices, since these devices have dimensions broadly on the same scale as individual 
molecules; and we choose positive drift since it arises in our applications of interest (e.g., 
communications that takes advantage of the bloodstream). Our focus here is on the channel; 
we assume that the transmitter and receiver work perfectly. We assume the receiver has infinite 
time to guarantee that all transmitted molecules will arrive and that there are no "stray" particles 
in the environment. Therefore, in our system, communication is corrupted only by the inherent 
randomness due to Brownian motion. 

The key contributions of this paper are: 

• Most importantly, we show that a molecular timing channel can be abstracted as an additive 
noise channel with the noise having inverse Gaussian (IG) distribution (Section II); thus, the 
molecular communication is modeled as communication over an additive inverse Gaussian 
noise (AIGN) channel. This forms the basis of the theoretical developments that follow. 

• Using the AIGN framework, we obtain upper and lower bounds on the information theoretic 
capacity of a molecular communication system (Theorem 1). 

• We investigate receiver design for molecular communication and present three key results: 
A maximum likelihood estimator (Theorem 2) and an upper bound on the symbol error 
probability (Theorem 3). We also show an effect similar to diversity order in wireless 
communications when multiple molecules are released simultaneously (Theorem 4). 

While the work in [10] is based largely on simulations, the AIGN framework developed here 
allows us to place molecular communications on a theoretical footing. However, we emphasize 
that this paper remains an initial investigation into the theory of molecular communications in 
fluid media. 

This paper is organized as follows: Section II presents the system and channel model under 
consideration. Section III then uses this channel model to develop capacity bounds for this 
system. Section IV then develops a maximum likelihood (ML) receiver. Section V wraps up the 
paper with extensive discussion, a few open problems and some concluding remarks. 

Notation: h(X) denotes the differential entropy of the random variable X. X ~ exp(7) 
implies that X is an exponentially distributed random variable with mean I/7, i.e., fx{x) = 
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Fig. 1. System Model with transmitter at to = and receiver at to = d 



7exp~''''^,x > 0. C{X) denotes the Laplace transform of the the probability density function 
(pdf) of the random variable X. Throughout the paper, log refers to the natural logarithm, hence 
information is measured in nats. 

II. System and Channel model 

Let W{x) be a continuous-time random process which represents the position at time x 
of a molecule propagating via Brownian motion. Let Q < xi < X2 < ■ ■ ■ < represent 
a sequence of time instants, and let Ri = W{xi) — W{xi_i) represent the increments of the 
random process for i G {1,2,..., k}. Then W{x) is a Wiener process if the increments Ri are 
independent Gaussian random variables with variance cr^(xj — Xi^i). The Wiener process has 
drift if E[Ri\ = v{xi — where v is the drift velocity. The Wiener process is an appropriate 

model for physical Brownian motion if friction is negligible [21]. 

The system under consideration is illustrated in Fig. 1. The transmitter releases one or more 
molecules into the fluid medium at some chosen times; the molecules then propagate to the 
receiver. The receiver notes the arrival time(s) and uses this to estimate the time(s) of transmis- 
sion. In the figure the receiver is depicted as a wall, since we assume that molecules cannot 
propagate beyond the receiver - and once a molecule arrives, it is absorbed and does not return 
to the medium. We therefore model one-dimensional propagation; however, our analysis doesn't 
change in a two- or three-dimensional environment, as long as the environment is isotropic. 
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Consider a fluid medium with positive drift velocity v and free diffusion coefficient D, where 
the Wiener process variance is given by cr^ = £)/2 (see footnote'). A molecule is released into 
this fluid at time x = at position w = Under the Wiener process, the probability density of 
the particle's position w at time x > is given by [23] 

fw{w; x) = — ==exp — . (1) 

V27ro-2a; V 2cr^x J 

That is, treating the time x as a parameter, the pdf of the position w is Gaussian with mean vx 

and variance a'^x. 

Since the receiver acts as a perfectly absorbing boundary, we are only concerned with the first 
arrival time N at the boundary. We assume that the transmitter is located at the origin, and in 
the axis of interest, the receiver is located at position d > 0. In this case, the first arrival time 
is given by 

N = mm{x : W{x) = d}. (2) 

The key observation here is that if v > 0, the pdf of A^, denoted by fN{n), is given by the 
inverse Gaussian (IG) distribution [24] 



fN{n) = { V — ' V Vn / ' ' 

0, n<0. 

where 

H = and (4) 

V 

A = (5) 

3 

The mean and the variance of N are given by m^r = /i and \/ar(N) = respectively. We will 
use lG{fi, A) as shorthand for this distribution, i.e., ~ ^G{fi, A) implies (3). It is important to 
note that if v = 0, the distribution of A^ is not IG. Furthermore, if f < 0, there is a nonzero 
probability that the particle never arrives at the receiving boundary. Throughout this paper, we 
will assume that i; > 0. 

To develop our molecular communication channel, we assume that the processes W{x) are 
independent for different molecules. The information to be transmitted is encoded in the transmit 

'in [22], values of D between 1-10 fim^/s were considered realistic for signalling molecules. 
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time of each molecule. The transmitter sends symbols X G IR+, where IR+ represents the set of 
nonnegative real numbers; the symbol X = x represents a release of a single molecule at time 
X. This molecule has initial condition W{x) = 0; the molecule propagates via a Wiener process 
with drift velocity v > 0, and Wiener process variance coefficient cr^. This process continues 
until arrival at the receiver, which occurs at time Y E M+. We assume that the propagation 
environment is unlimited and that, other than the receiving boundary, nothing interferes with the 
free propagation of the molecule. Under these assumptions, for a single molecule, clearly 



where is the first arrival time of the Wiener process. Substituting into (3), the probability of 
observing channel output Y = y given channel input X = a; is given by 



It is apparent that the channel is affected by additive noise, in the form of the random propagation 
time N; furthermore, by assumption, this is the only source of uncertainty or distortion in the 
system. As the additive noise N has the IG distribution, we refer to the channel defined by 
(6)-(7) as an additive inverse Gaussian noise channel. Note that we assume that the receiver can 
wait for infinite time to ensure that the molecule does arrive. 

The results below follow directly from this IG framework. Several of the results are based on 
properties of the IG distribution available in [24]. Previous works on the IG distribution were 
motivated by its application in diverse fields such as financial, reliability, hydrology, linguistics 
and demography [24], [25]. 



A. Main Result 

Equation (6) is reminiscent of the popular additive white Gaussian noise (AWGN) channel, a 
crucial parameter of which is the channel capacity. As in the AWGN case, the mutual information 
between the input and the output of the channel is given by 



Y = X + N, 



(6) 




(7) 



III. Capacity Bounds 



I{X-Y) 



h{Y) 



h{Y\X), 



h{Y) 



h{X + N\X) = h{Y) - h{N\X), 



h{Y) 



h{N), 



(8) 
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since X and are independent. The capacity of the channel is the maximum mutual information, 
optimized over all possible input distributions fx{x)- The set of all possible input distributions 
is determined by the constraints on the input signal X. With the information being encoded 
in the release time of the molecule, there is no immediate analog to input power for the 
AWGN channel; the constraints are application dependent, e.g., both peak-constrained and mean- 
constrained inputs appear reasonable. So far, peak constraints have not been analytically tractable; 
in this paper we constrain the mean of the input signal such that 

E[X] < m. (9) 

That is, on average we are only willing to wait m seconds to transmit our signal. Thus, we 
define capacity as follows: 

Definition 1: The capacity of the AIGN channel with input X and mean constraint E[X] < m 
is defined as 

C= max I(X:Y). (10) 

/x(x):E[X]<m 

From the receiver's perspective, E[N] is finite as long as v > 0, so (9) ensures that the 
expected time of arrival at the receiver is constrained, i.e., E[Y] = E[X] + E[N] < m + E[N]. 
Further, note that peak constraints are not possible at the receiver, since the pdf of is supported 

on [0, oo). 

Unfortunately, unlike the AWGN channel, there is no simple closed-form, single-parameter 
characterization of the AIGN channel capacity; however, we use the IG distribution to form 
bounds on the capacity. Thus, our main result in this section is an upper and lower bound on 
the capacity of the AIGN channel. 

Prior to stating this result, we need the following two properties of the IG distribution: 
Property 1 (Differential Entropy of the IG distribution): Let /iiG(/i,A) represent the differential 
entropy of the IG distribution with the parameters fi and A. Then 

h 1 (OU^ /^ / N ^ , 3 j^i^7(V^)|7=^l/2 , A /ii/2(A//i) + A-,3/2(A//i) 

/.,o(„A) = log (2A_v.(A//.)/.) + - + ^ K:;i;m ' ^''^ 

where K^{-) is the order-7 modified Bessel function of the third kind. ■ 
This property is easily derived from the differential entropy of a generalized IG distribution; see 
Appendix A. An expression for the derivative of the Bessel function with respect to its order, 
needed in the second term of (11), is given in [26]. 
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Property 2 (Additivity property of the IG distribution, from [24]): Let Ni ~ IG(/ij,Aj),i = 
be / not necessarily independent IG random variables and = n for all i, and let 
N = J2i c^Ni, > 0. Then N ~ IG(^^ q/x,, c^^f). ■ 
The bounds on the capacity C are then given by the following theorem. 
Theorem 1: The capacity of the AIGN channel, defined in (10), is bounded as 

hIG{m+^l,{x/^,^){m+^J.)2) - hiG{,i,x) <C< log((/x + m)e) - /iiG{^»,A), (12) 

where /iiG(/i,A) is given by Property 1 . 
Proof: From (8), 

I{X-Y) = h{Y)-hoi,,x), (13) 

with /iiG{^j,A) given by Property 1. /(X; Y) is therefore maximized by maximizing h{Y) subject 
to the constraint given by (9), equivalently E[Y] < m+yU. Hence, I{X; Y) achieves its maximum 
value when h{Y) is maximized subject to the following two constraints: first, = 0,y < 0, 

and second, E[Y] < m + fi. 

For the upper bound, for a random variable with a mean constraint, it is known that the expo- 
nential distribution, defined over the interval (0, oo), is the entropy maximizing distribution [27]. 
Let Y ~ exp(l/m + /i)); then h(Y) = log((m + /i)e) > h(Y) for any possible distribution of 
Y with E[Y]=m + fx. Thus, 

C < log((m + ^)e) - /iiG(M,A). (14) 

For the lower bound, suppose the input signal X is IG distributed with mean equal to m, 
satisfying (9). Choose the second parameter of the IG distribution for the input signal X as 

{\/^i'^)m'^ i.e., X ~ IG(m, {\/fi^)m'^). Then from Property 2, F ~ IG(m + /i, {\/ fi'^){m + ^if) 
and h{Y) = hiQ^rn+fi,{x/n'^){m+fj,)2)- The mutual information is given by 

H^'^ ^) = ^IG(m+/x,(A/Ai2)(m+Ai)2) — ^IG(m,A) (15) 

Note that /y(y) in this case is not necessarily an entropy maximizing distribution for a given 
mean of m + /i, and hence 

C > /ilG(m+Aj,(A/^2)(m+/i)2) — ^IG(/i,A)- (16) 

The theorem follows from (14) and (16). ■ 
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Note that if one could find a valid pdf for X (with E[X] < m) that resulted in an exponential 
distribution for Y (via convolution with the IG distribution of N) then the expression in (14) 
would be the true capacity for mean constrained inputs. For example, at asymptotically high 

3 

velocities, i.e., as v oo, jj, = d/v — t- and the variance Var(A^) = ^ > 0, i.e., the noise 

distribution tends to the Dirac delta function. The fact that ^ — )■ 1 as — )■ oo is proven in [25]. 
The fact that Y is distributed exponentially then leads to the conclusion that, at high drift 
velocities, the optimal input X is also exponential, i.e., X ~ exp(l/m). 

At low velocities, the situation is considerably more complicated. As shown in Appendix B, the 
deconvolution of the output (Y) and noise (A^) pdfs leads to an invalid pdf, i.e., at asymptotically 
low velocities, this upper bound does not appear achievable. 



B. Numerical Results 

We now present numerical results by evaluating the mutual information of the AIGN channel 
and, in order to illustrate the upper and lower bounds, we consider four cases: 

1) F ~ exp(l/(m + /i)), 

2) F ~ IG(m + /i, (A//i2)(m + /i)2), 

3) X is uniformly distributed in the range [0,2m], 



4) X is exponentially distributed with mean m, i.e., X ~ exp(l/m) with v > ^/2o^Jm. The 
need for this constraint is explained below. 
In all the four cases, m = 1. The first two choices correspond to the upper and lower bounds 
in Theorem 1, respectively. The final two choices also provide lower bounds on the capacity, 
though in these cases we can only express /y(y) (and not h{Y)) in closed form; numerical 
integration must be used to calculate mutual information. In the case where X has the uniform 
distribution on [0,2m], convolving the input and noise distributions leads to 



h^Niy), y < 2m; 



fy{y)={ ~ (1^^ 

^ {F^iy) - F^iv - 2m)) , y>2m. 
where F]\f{n) is the cumulative distribution function (cdf) of N and is given by [24] 

FM ^<,(,[I(!l-l]]+ + (18) 
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Fig. 2. Mutual information as a function of velocity; — 1. 



where $(2;) = | + erf y^jj the cdf of a standard Gaussian distributed random variable 
Z. In the case where X ~ exp(l/m) with m > 2(T^/f^, the convolution leads to [28] 

fyiy) = le(--^^i^) fe-^^/-^$ f + e^^^/^^^ ( ^tllA)) (19) 

where k = — The constraint on velocity, > 2a'^/m ensures real k. 

Figure 2 plots the mutual information as a function of velocity for the four cases listed above. 
The upper and IG lower bound are close to each other only over a narrow range of velocities. 
Further, the cases with exponential and uniform inputs track the upper bound, with the exponential 
input approaching the bound at high velocities. This is consistent with the discussion in the 
previous section. However, given its finite support, a uniform input may be closer to a practical 
signalling scheme. Unsurprisingly, the plot shows that velocity is an indicator of channel quality 
in that the mutual information increases without bound as velocity increases. As a caveat, this 
understanding may be valid only at higher velocities; the upper bound is not monotonic, and at 
very low velocities the the upper bound actually decreases with increasing velocity. 

The complicated relationship between mutual information and velocity arises because, unlike 
AWGN channels, there is no single parameter like SNR that determines the mutual information. 
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Fig. 3. Mutual information as a function of diffusion constant, (v = 1). 

The pdf in (3) is a function of both velocity (via //) and diffusion constant, (via A). An 
example of this complex relationship is shown in Fig. 3, where v = 1. Both the upper bound 
and the mutual information with uniform inputs fall with increasing diffusion (randomness), but 
then further increasing diffusion increases mutual information. 

The increase in mutual information as a function of diffusion is counterintuitive since diffusion 
is assumed to be the source of randomness. To understand this result it is instructive to consider 
the zero-velocity (no drift) case. Without diffusion, the molecule would remain stationary at 
the receiver, never arriving at the receiver, and result in zero mutual information. In this case, 
increasing diffusion helps communication. So, while it is true that diffusion increases randomness, 
its impact is not monotonic. To illustrate this effect, consider Fig. 4. Here, the velocity is set 
relatively high (v = 10). The plots are the entropies and mutual information (upper bound) as a 
function of the diffusion constant. Here, the upper bound falls steeply until ~ 4, very slowly 
until cr^ ~ 10 and then rises slowly for increasing cr^. This is because for relatively large values 
of cr^, this velocity appears "low" and increasing diffusion increases mutual information. This 
is confirmed by the falling entropy of the noise term (h{N)). 



■A — Upper bound; Y Exponential 
^ — Lower bound; X IG 
-e — Uniform X 
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Upper bound on l(X;Y); Y exponential 

- - h(N) 

-e — h(Y) in case of upper bound 




Fig. 4. Mutual information as a function of diffusion constant, a {v = 10). 



To summarize, in this section we developed capacity bounds for the AIGN channel based on 
the IG distribution of the molecule propagation time. While increasing velocity increases mutual 
information, increasing diffusion beyond a point also increases mutual information. Unlike the 
AWGN channel, no single parameter captures the performance of the AIGN channel. 

IV. Receiver Design 

We now discuss receivers for this channel by recovering the transmitted message (transmission 
time) from the times the molecules are received. We develop both the maximum likelihood (ML) 
estimator and the ML detector, and provide an error probability analysis for the ML detection. 

A. Maximum Likelihood Estimator (MLE) 

The ML estimator of X, denoted by Xml, is given by 

Xml = arg max fY\xiy\X = t), (20) 

where 



December 10, 2010 



DRAFT 



13 



and fY\x{y\X = t) = for y < t. The pdf given above is commonly known as the shifted IG 
distribution, or the three-parameter IG distribution, and is denoted as IG(to, yU, A) where to is the 
location parameter [24], or the threshold parameter [25]. The mean of the shifted IG distribution 
is n + t. 

Theorem 2: Let Xml represent the ML estimate of the transmitted symbol X in an AIGN 
channel. Then 



/i^ / 3 _ /9 A2 



^ML = y + ^U-W7 + -|- (22) 



Proof: Let A(tj) = \og fY\x{y\X = ti) represent the log-likelihood function. Since log is 
mono tonic, 

Xml = argmax fY\x{y\X = U) = argmaxA(ti). 

In our case, 

K(ti) = I ^ ^ ^^2^ (y-u) y (23) 
[ -oo, y < ti. 

By setting ^^^^ = 0, and searching over values of tj < y, we obtain the MLE given by (22). ■ 
This result is consistent with the expected high velocity case (v — )■ oo), wherein Xml = y- 



B. ML Detection: Symbol Error Probability Analysis 

Analogous to the use of a signal constellation in AWGN channels, we now restrict the input 
to the channel, i.e, the transmission time, to take discrete values: for T-ary modulation we have 

X e{tu...,tT},0<ti<t2,...<tT. 

Using such a discrete signal set, we analyze the error probability for binary modulation with 
ML detection at the receiver. Let X E {^1,^2}, < ti < t2, with Pr(X = ti) = pi and 
Fr{X = ^2) = P2- The log-likelihood ratio L{y) is given by 

f{y\x = t2) 



L{y) = log 



f{y\x = t^) 

A(t2)-A(ti) 

-00, y < 

iogfz| + 2^^(/^^fe-^)+^i-0. y>t2. 



(24) 
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If L{y) is positive (negative), then ^2 has higher (lower) likelihood than ti. If L{y) = 0, then 
there is no preference between ti and t2', we ignore this case, which occurs with vanishing 
probability. Thus, for ML detection, the decision rule is: 

Pick X = ^2 if L(y) > 0, otherwise pick X = ti. 
For MAP detection, we use the same decision rule, replacing L{y) > with L{y) > \og{pi/p2). 
The symbol error probability (SEP) is given by 

Pe = PiPr{ti ^t2}+ P2Mh ^ ti}, (25) 
where Pr{tj — tj} is the probability of Xml = tj when X = U. 

POO 

PT{h^t2}= / fYiy\X = h)dy (26) 

where yth is the decision threshold value of y, satisfying L{yth) = 0. Similarly, 

rvth 

Pr{t2-^ti}= / fYiy\X = t2)dy. ill) 

We now give an upper bound on the error probability for the case when pi > p2, which is simple 
to calculate and yet closely approximates the exact error probability. 

Theorem 3: Let X e {^1,^2}, < ti < t2, with Pr(X = ti) = pi, Py{X = ^2) = P2 and 
Pi > P2- The upper bound on the symbol error probability of the ML detector in an AIGN 
channel with input X is given by 

Pe<Pl{l-FM{t2-h)). (28) 



Proof: To prove (28), let 

/•oo POO 

6= / fYiy\X = t,)dy- fY{y\X = t^)dy 

>^<2 J Vth 

rvth 

= / fY{y\X = t,)dy. (29) 

Then 

POO 

Pr{ti->t2} = / fY{y\X = U)dy 



Vth 
00 



fY{y\X = h)dy-6. (30) 



t2 
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Note that 6 > since yth > ^2- Furthermore, 



yth 



Mh^h} = / fY\x{y\X = t2)dy 

Jt2 

rvth 

< / fY\x{y\X = h)dy (31) 
= (32) 

where (31) follows since, under ML detection, fY\x{y\X = ti) < fY\x{y\X = ^2) when y < yth- 
Finally, (25) becomes 

Pe = PiPr{ti ^t2}+P2Pr{t2 ^ti} 

< Pi(^j^°° fY{y\X = h)dy-6^+p2S (33) 

00 

t2 
00 



= Pi fY{y\X = ti)dy - {pi - p2)6 

POO 

< Pi / fY{y\X = ti)dy, (34) 



t2 



where the last inequality follows since pi > p2 (by assumption), and so {pi—p2)S is non-negative. 
Finally, note that J^^ fy^ylX = ti)dy = 1 - FN{t2 - ti), and (28) follows. ■ 
Corollary 1: The bound in (28) is asymptotically tight as i; — 00, i.e., 

lim (Pe -Pi{l- FM{t2 - h))) = 0. (35) 



Proof: The error in bound (33) is at most p25, and the error in bound (34) is equal to 
{pi — P2)S; thus, the total error is at most P2S. Noting that // — )■ as — )■ 00, we show that 
5 — > as /X — 0. For y > t2, we have 



A / X{y-h-2^i)\ f X 
exp — exp 



WTr{y-tiy^ V 2/i2 J \ 2{y-t 



- ^^ Mt2-to^ ^"H — V — )■ ^''^ 

Finally, 5^0 follows from substituting (36) into (29): since t2 — h > (by assumption), then 
fY\x{y\X = ti)^0 for all y > t2 as fi ^ 0, and (35) follows. ■ 
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Fig. 5. Deriving thie upper on symbol error probability; ti = 0, 12 = 1, v = 1, = 1 and d = 1. 

To illustrate this result, consider Fig. 5: 5 is the area under the curve f{y\X = ti) as y 
varies from t2 to yth and is always larger than J^^"" /y(t/|X = t2)dy, the area under the curve 
fviylX = t2) from t2 to yth- 

This bound can easily be generalized to T-ary modulation. When X G {ti, . . . , tr}, < ti < 
^2, • • • <tT and > P2 > • • • > Pt^ the upper bound on symbol error probability is given by 

T-l 

Pe<^Pi{l- Fn {U+l - ti)) . (37) 

1=1 

To compute the ML estimate, the receiver needs to know i^i and A, the parameters of the noise. 
One way to enable the receiver to acquire the knowledge of these parameters is by training 
as in a conventional communication system. Appendix C provides the ML estimates of these 
parameters based on the IG pdf. 

C. Improving Reliability: Transmitting Multiple Molecules 

The performance of a molecular communication system (the mutual information and the error 
rate performance) can be improved by transmitting multiple molecules to convey a message 
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symbol. We assume that the trajectories of the molecules are independent and they do not 
interact with each other during their propagation from the transmitter to the receiver. 

The transmitter releases M > 1 molecules simultaneously to convey one of T messages, 
X E {ti, . . . , tr}. In [9], it was shown using simulations that if multiple molecules are available, 
releasing them simultaneously is the best strategy. Essentially, releasing them at different times 
leads to confusion at the receiver with molecules potentially arriving out of order. In the case 
of simultaneous transmissions, the receiver observes M mutually independent arrival times 

Y,=X + N,, j = l,...,M, (38) 

where Nj are i.i.d. with Nj ~ IG(/i, A), j = 1, . . . , M. 

1) Maximum likelihood estimation: We first consider ML detection of the symbol when 
multiple molecules are used. Assuming that the receiver knows the values of fi and A through 
an earlier training phase, it can use the multiple observations Yj,j = 1, . . . , M, to obtain Xml- 

The pdfs fYj\xiyj\X = U),] = 1, . . . , M, are i.i.d. with fY^\x{yj\X = U) given by (21). The 
ML estimate, in this case, is given by 



M 

Xml = arg max J]^ fy^ \x iyj\X = U 



= argmaxn(y, -t,)-3/2exp f-^^ E ^^^^^r^^ , > U (39) 

' j=\ \ j=i ~ ) 

Simplifying the above equation, the ML estimate can be expressed as 

Xml = arg max Am (t,) (40) 

where 



2) Linear filter: The above approach estimates the transmitted message using a complicated 
ML detection filter that processes the received signal. Given the potential applications of this 
research, a simpler filter would be useful. One such filter is the linear average, which is optimal 
in an AWGN channel [29]. In this case, the receiver averages the M observations and performs 
a ML estimate with the sample mean as the test statistic. The receiver generates 

M 

(42) 
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The linear filter has the following nice property: by the additivity property of IG distribution in 
Property 2, Z ~ IG{E[X] + /i, MA). Now, 



Xml = argmaxf z{z\X = U), 



where 



The linear receiver therefore acts as if the diffusion constant, cr^, is reduced by a factor of M to 
a'^/M. At reasonably high velocities, this leads to better performance; however, we have seen 
in Section III that, at low velocities, diffusion can actually help communications. 

At high drift velocities the reduction in the effective diffusion results in an effect akin to the 
diversity order in wireless communication systems. This is shown in the following result. 

Theorem 4: As drift velocity f — oo, 

2 2 
CV CV 

log(Pe) < -Ci— + C2 + C3l0g— , (44) 

where Ci,C2 and C3 are constants. 

Proof: The proof is found in Appendix D. ■ 
Furthermore, for M molecules and detection using the linear filter, 

log(P,) < -Ci^ + C, + Cs log (45) 

which is essentially (44) with cr^ replaced by a'^/M. 

Since, in both (44) and (45), the first term dominates at high velocities, a semi-log plot of Pg 
versus velocity is asymptotically linear, with slope proportional to —M. 

D. Simulation Results 

Figure 6 shows how the variance and the mean of the ML estimate vary with velocity for a 
given cr^. With increasing velocity, the estimator becomes unbiased and the variance approaches 
zero. As in Section III, velocity appears to be close to the AIGN equivalent of SNR in AWGN 
channels; however, again, this is only true at high velocities. At low velocities, both the velocity 
and the diffusion constant play a role. 

Figure 7 plots the symbol error probability with T-ary modulation for different values T. 
The input alphabet employed for simulations is X e {1 + = l,...,r}. The figure 
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^ — Standard deviation; X = 1 , o = 1 
+ -Mean;X = 1,o^ = 1 
^ -|vlean;X = 2,o^ = 1 
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Fig. 6. Mean and standard deviation of Xml- 
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Fig. 7. Comparing the analytical upper bound and simulated enor probability; single molecule case with T-ary modulation. 
Equiprobable symbols and = 1. 



December 10, 2010 



DRAFT 



20 



— M=1 ML Detection 
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Velocity 

Fig. 8. Comparing the error probability of MLE with the averaging filter. Equal a priori probabilities and = 1. 

also compares the upper bound on error probability, presented in Section IV-B, with the error 
probability obtained through Monte Carlo simulations. The rapidly deteriorating error probability 
is clear, as is the tightness of the upper bound. 

The poor performance of T-ary modulation as shown in Fig. 7 motivates the multiple molecule 
system described in Section IV-C. Figure 8 plots the error rate performance when X G {1,2} 
and each symbol is conveyed by releasing multiple molecules. As expected, there is a effect 
akin to receive diversity in a wireless communication system. Here, the performance gain in the 
error probability increases with the number of molecules transmitted per message symbol. 

Figure 8 also compares the performance of the averaging filter with the ML estimation given 
by (40). The linear averaging filter is clearly suboptimal with performance worsening with 
increasing number of molecules transmitted per symbol (M). This result again underlines the 
significant differences between the AIGN and AWGN channel models. 

V. Discussion and Conclusions 

In proposing a new channel model based on IG noise, we have necessarily analyzed the 
simplest possible interesting cases. In this regard, there are several issues left unresolved. 
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Single versus Multiple Channel Uses: Throughout this paper, we have focused on the case of 
a single channel use, in which we use the channel to transmit a single symbol of information; 
our capacity results are measured in units of nats per channel use. Translating these results to 
nats per molecule is straightforward: each channel use consists of a deterministic number of 
molecules M, where M > 1, thus, we merely divide by M. However, measuring nats per unit 
time is a more complicated issue, since the duration of the channel use is a random variable, 
dependent on both the input and the output. Following [19], where the capacity per unit time 
of a queue timing channel was calculated with respect to the average service time, here we can 
normalize our capacity results either with the average propagation time -^[A^], or the average 
length of the communication session E[Y]. Since E[Y] = E[X]+E[N], our decision to constrain 
the mean of the input distribution fx{x) would then have a natural interpretation in terms of 
the capacity per unit time. 

Further, our system model excludes the possibility of other molecules propagating in the 
environment, except those transmitted as a result of the channel use; equivalently, we assume 
each channel use is orthogonal. This raises the question of how to use the channel repeatedly: if 
the signalling molecules are indistinguishable, then (under our formulation) the transmitter must 
wait until all M molecules have arrived before a new channel use can begin. On the other hand, 
if the signalling molecules are distinguishable, then channel uses can take place at any time, or 
even the same time. This is because, if there is no ambiguity in matching received molecules to 
channel uses, those channel uses are orthogonal. 

Inter-symbol Interference: Repeated channel uses also leads to a situation akin to inter-symbol 
interference (ISI) in conventional communications. Since propagation time is not bounded, the 
transmitter may release the molecule corresponding to the "next" symbol while the "previous" 
molecule is still in transit. Molecules may, therefore, arrive out of order. This problem is 
exacerbated if multiple molecules are released simultaneously to achieve diversity. Decoding 
with such ISI is complex since schemes such as the Viterbi algorithm cannot be used (even 
ignoring the fact that the system would, in theory, have infinite memory). This is because, in 
each time slot, the number of molecules not yet received - due to transmission from previous 
time slots - acts as the state of the channel with corresponding noise distributions. In other 
contexts, an example of a channel with states is the Gilbert-Elliott channel [30]. 

Synchronization and Differential Encoding: The system model and the analysis presented here 
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assumes perfect synchronization between the transmitter and the receiver. It is unclear how 
difficuk, or easy, it would be to achieve this with nano-scale devices. An information theoretic 
analysis of the effect of asynchronism in AWGN channels has been presented in [31]. Given 
the importance of timing in our model, extensions of such work to the AIGN channel would 
be useful. An interesting alternative would be to use differential modulation schemes such as 
interval modulation presented in [32]. 

Amplitude and Timing Modulation: The work presented here focuses on timing modulation, 
which leads naturally to the AIGN channel model. A more sophisticated scheme would be to use 
"amplitude" modulation as well - such as by varying the number of molecules released. It may be 
possible to leverage work on positive-only channels such as in optics [33]. Amplitude modulation 
could be coupled with the timing modulation considered here. However, it is important to note 
that any amplitude information would reproduced at the receiver faithfully since, in the model we 
have considered so far, the receiver is allowed to wait for all molecules to arrive before decoding. 
Therefore, to be useful, a reasonable model of amplitude modulation must also include receiver 
imperfections and account for the issue of ISI as described above. 

Two-way Communication and Negative Drifts: The AIGN channel model is valid only in the 
case of a positive drift velocity. In this regard, it does not support two-way communication 
between nano-devices. With zero drift velocity, the mean transition time is unbounded, but 
the probability that the molecule arrives approaches 1; with negative drift velocities, even this 
arrival is not guaranteed [24]. Molecular communications with negative drift velocities remains 
a completely open problem and one that is outside the scope of this paper. In this case, the noise 
term is IG(— yU, A) and the IG framework provided here may be used to analyze such a problem. 

In conclusion, our results both illustrate the feasibility of molecular communication and show 
that it can be given a mathematical framework. However, our results lead to many interesting open 
questions, some of which are described above. We believe our key contribution here has been 
to provide this mathematical framework, making it possible to tackle some of these problems. 

Appendix A 
Differential entropy of the IG distribution 

Here we prove Property 1. For a given /i and A, the differential entropy of the noise h{N) 
is fixed and can be computed from the generalized IG distribution (GIG). The GIG distribution 
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is characterized by three parameters and the pdf of a random variable X distributed as GIG is 
given by [24] 

fx{x; 7, n, X) = -r-r- exp 



-cx)<7<oo,/i>0, A>0,x>0, (46) 

where K^{-) is the modified Bessel function of the third kind of order 7. It is commonly denoted 
as GIG(7,/i, A) and IG(/i, A) is a special case, obtained by substituting 7 = —1/2 [24]. 
When X ~ GIG(7,/i, A), its differential entropy, in nats, is given by [34] 

MJf ) = log (2K.M,M - (7 - + T, A,(A/,] 

Setting 7 = —1/2, the differential entropy ofA^~IG(//,A)is given by 

/i(A^) = hiG^^^x) = log (2A'_i/2(A//i)/i) 



3^^^7(V^^^k=-l^ A iri/2(A/^)+i^-3/2(A//i) 

2 i^'-i/slA/zi) ^2/i i^-i/2(A//i) 

(48) 



and the property follows. 



Appendix B 

Evaluating optimal input distribution at low velocities 

If a pdf exists that leads to an exponentially distributed measured signal Y, it would be 
the capacity achieving input distribution. Furthermore, the pdf of the measured signal is the 
convolution of the pdf of the input and that of IG noise pdf. We therefore attempt to evaluate 
the optimal distribution at asymptotically low velocities by deconvolving the known optimal 
distribution (exponential) of the output Y and the IG noise. The Laplace transform of the IG 
distribution is given by 



C{N) = E[e-'^] = exp 



(49) 



For given values of cr^ and (i, as f — )■ 0, /i — )■ 00 and 7 is fixed. In such a case, C{N) can be 
approximated as 

C{N) ^ exp (^-V2X^^ . (50) 
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As Y = X + N, C{X) = C(Y)/C{N). To achieve the upper bound on capacity, fyiv) 
J-e"^Y ^ where my = E[Y] = E[X] + and hence 



C{Y) = -^i^ C{X) = TT^^^exp (v^) (51) 
^ ^ s + (l/my) ^ ^ (l/my) + s / 

and the pdf of X can be obtained by computing the inverse Laplace transform C'^{X). The 

inverse Laplace transform can be computed by making use of the following Laplace transform 

pair [35]: 



+ exp (^cVaTtj erfc + V(a + 6)t^ ^ , (52) 



where a, 6 and c are constants. Using(52), we obtain 



exp(v/2Av/i)} = (1Z!^^|)^ (exp (jv/2A7^) erfc {-y^t - j^V^ 
+ exp (-jv^2A/my) erfc (^->/\/2i + j^/t/my^^ (53) 



where 

r>00 



erfc(2;) = / dz 



Note that erfc(2;) can be evaluated for complex values of its argument z and erfc(2;*) = (erfc(2))*, 

where z* is the complex conjugate of z. Hence 

-1 

fxix) = jexp (jV2A/my) erfc - 3^x/my^ } . (54) 

This, unfortunately, does not appear to be a valid pdf. The capacity of the AIGN channel at low 
velocities is therefore, yet, unknown. 

A. When there is no drift 

To confirm the result in (54), we test the case of zero velocity. Note that in this case, the noise 
is not IG; however, the zero velocity case converges in limit to the case without drift. Without 
drift, the arrival time has a pdf given by [24], 



/(t) = \/^exp^, t>0 (55) 
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Note that t ~ Inverse Gamma(l/2, A/2). The inverse Gamma distribution, with shape parameter 
a and scale parameter /3, is given by 

fit- a, (3) = -^(l/t)"+iexp(/3/t), t > 0. (56) 

r(«) 

Hence, the Laplace transform of the inverse Gamma distribution is 

C{N) = C[InvGamma{l/2, A/2)] = ^ — K\/2{V2}^). (57) 

Substituting 



Ky2iz) = ^^e-^, (58) 

we get 

jr{N) = e"^ (59) 



This results in 



^^^^ = s + {l/myf ^^^^ 



Note that (59) is same as (50) and (60) is same as (51). Hence, we get (54) when we try to 
obtain fx{x) by evaluating C^^{X). 

Appendix C 
Estimating Noise Parameters 

To estimate the noise parameters, the transmitter releases k "training" molecules at known 
time to. Let the receiver observe Yj = tQ + Nj,j = 1,2, . . . , k, where Nj ~ IG(/i, A) are i.i.d. and 
the receiver knows to a priori. The pdf's of {Yj — to), j = 1, . . . , A;, are i.i.d. and IG distributed 
as given by 



In general, oo < to < —oo; however, in our case, Q < t^ < oo. When Y ~ IG(to,At, A), 
my = E\Y] = /i + to- When the receiver knows the value of to, the ML estimates of the 
remaining two parameters ^ and A can be obtained as 

/i(to)=F-to, (62) 
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where Y = j: Yl'j=i ^ the sample mean and 



A(«o) 



-Ef- 



(63) 



Assuming and A does not change significantly from the time the receiver estimates the 
parameters and the time of actual communication, the receiver can obtain the ML estimate 
of the release times of the molecules. 



Appendix D 
Upper Bound on Asymptotic Error Rate 

Here we prove Theorem 4. Recall that, for 2-ary modulation with X E {^1,^2)50 > ti > t2, 
the upper bound on SEP is given by 



Pe<piii-FNih-h)). 



where 



\ ( n 



n \/i 



(64) 



(65) 



where $(2;) = ^ ^1 + erf {^-^j^^ is the cdf of a standard Gaussian distributed random variable 
Z. Here, 

(66) 



erf(2;) = I e '^"du 

'T^ Jo 



For z ^ 1, erf(z) can be approximated as 



erf(z) ^ 1 



(67) 



Now, we compute F]\f{c), c = t2 — ti, and examine its behavior as f — )■ 00. Recall that /i = ^ 



and A = 4. 



d^ 



(68) 



Consider the first term in Fm(c 



(7^ V ca^ 



rf2 



- I 1 + erf 



V2 



(69) 
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When V ^ oo, \ %- oo and thus 
given by (67) to obtain 

^ 1 

Now, consider the second term in Fn{c). 




1. Hence, we use the approximation 




ci)^ I 2vd d~ 



(70) 




(71) 



When i; ^ oo J + > 1 and, using the approximation given by (67), we obtain 



(72) 



Hence, 



Fn{c) 



As ev 




(73) 



decays faster than ev ^ ^ ^ the second term in the above equation 



dominates the rate at which Fn{c) goes to 1 as f — j- oo. At high velocities, F^{c) can be 
approximated as 

F^{c) ^ 1 - (74) 



1 e o-^ 



The theorem follows 



Thus, at high velocities, the upper bound on SEP is given by 
by taking the logarithm of this expression. 
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